Finding Approximate Tandem Repeats with the Burrows-Wheeler Transform
Authors
Abstract
Approximate tandem repeats in a genomic sequence are two or more contiguous, similar copies of a pattern of nucleotides. They are used in DNA mapping, in studying the mechanisms of molecular evolution, in forensic analysis, and in research on the diagnosis of inherited diseases. Their functions are still under investigation and not yet well defined, but growing biological databases, together with tools for identifying these repeats, may lead to the discovery of their specific roles or of correlations with particular features. This paper presents a new approach for finding approximate tandem repeats in a given sequence, where the similarity between consecutive repeats is measured using the Hamming distance. It enhances a method for finding exact tandem repeats in DNA sequences that is based on the Burrows-Wheeler transform.
Keywords: approximate tandem repeats, Burrows-Wheeler transform, Hamming distance, suffix array
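To make the definition concrete, the following is a minimal sketch of what counts as an approximate tandem repeat under the Hamming distance. It is a naive left-to-right scan for a fixed period, not the Burrows-Wheeler-based method the paper proposes; the function names and the parameters `period`, `max_mismatches`, and `min_copies` are assumptions introduced here for illustration only.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def approximate_tandem_repeats(seq, period, max_mismatches, min_copies=2):
    """Greedy scan for runs of adjacent length-`period` blocks.

    Each block may differ from the next one in at most `max_mismatches`
    positions. Returns a list of (start, period, copies) tuples.
    Illustrative brute force only; no BWT or suffix array is used.
    """
    results = []
    n = len(seq)
    i = 0
    while i + 2 * period <= n:
        copies = 1
        j = i
        # Extend the run while the next block is close enough to the current one.
        while (j + 2 * period <= n and
               hamming(seq[j:j + period],
                       seq[j + period:j + 2 * period]) <= max_mismatches):
            copies += 1
            j += period
        if copies >= min_copies:
            results.append((i, period, copies))
            i = j + period  # continue after the reported run
        else:
            i += 1
    return results


if __name__ == "__main__":
    # ACGT, ACGA, ACGT: neighbouring copies differ in one position each.
    print(approximate_tandem_repeats("ACGTACGAACGTTT", period=4, max_mismatches=1))
```

Setting `max_mismatches=0` reduces the scan to exact tandem repeats, the case handled by the method that the paper extends.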
Similar resources
Approximate Pattern Matching Using the Burrows-Wheeler Transform
The compressed pattern matching problem is to locate the occurrence(s) of a pattern P in a text string T, using a compressed representation of T, with minimal (or no) decompression. In this paper, we consider approximate pattern matching on the text transformed by the Burrows-Wheeler Transform (BWT). This is an important first step towards developing compressed pattern matching algorithm for BW...
Output Distribution of the Burrows-Wheeler Transform
The Burrows-Wheeler transform is a block-sorting algorithm which has been shown empirically to be useful in compressing text data. In this paper we study the output distribution of the transform for i.i.d. sources, tree sources and stationary ergodic sources. We can also give analytic bounds on the performance of some universal compression schemes which use the Burrows-Wheeler transform.
Improvements to the Burrows-Wheeler Compression Algorithm: After BWT Stages
The lossless Burrows-Wheeler Compression Algorithm has received considerable attention over recent years for both its simplicity and effectiveness. It is based on a permutation of the input sequence, the Burrows-Wheeler Transform, which groups symbols with a similar context close together. In the original version, this permutation was followed by a Move-To-Front transformation and a final ent...
FUNCTIONAL PEARLS Inverting the Burrows-Wheeler Transform
Our aim in this pearl is to exploit simple equational reasoning to derive the inverse of the Burrows-Wheeler transform from its specification. We also outline how to derive the inverse of two more general versions of the transform, one proposed by Schindler and the other by Chapin and Tate.
Attacking Scrambled Burrows-Wheeler Transform
Scrambled Burrows-Wheeler transform [6] is an attempt to combine privacy (encryption) and data compression. We show that the proposed approach is insecure. We present chosen plaintext and known plaintext attacks and estimate their complexity in various scenarios.